{"id":176,"date":"2026-03-21T09:03:32","date_gmt":"2026-03-21T09:03:32","guid":{"rendered":"https:\/\/medlearn.imperial.ac.uk\/innovation\/projects\/medlearn-sharepoint-qa-agent\/"},"modified":"2026-04-09T14:36:03","modified_gmt":"2026-04-09T14:36:03","slug":"medlearn-sharepoint-qa-agent","status":"publish","type":"page","link":"https:\/\/medlearn.imperial.ac.uk\/innovation\/projects\/medlearn-sharepoint-qa-agent\/","title":{"rendered":"MedLearn SharePoint Q&#038;A Agent"},"content":{"rendered":"\n<!-- BANNER -->\n<div style=\"width:100%;margin:0 0 36px 0;border-radius:10px;overflow:hidden;\">\n  <svg viewBox=\"0 0 1400 460\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"display:block;width:100%;height:auto;\">\n    <defs>\n      <linearGradient id=\"sp1\" x1=\"0%\" y1=\"0%\" x2=\"100%\" y2=\"100%\">\n        <stop offset=\"0%\" style=\"stop-color:#001E45;stop-opacity:1\"\/>\n        <stop offset=\"60%\" style=\"stop-color:#003E74;stop-opacity:1\"\/>\n        <stop offset=\"100%\" style=\"stop-color:#1a6b5a;stop-opacity:1\"\/>\n      <\/linearGradient>\n    <\/defs>\n    <rect width=\"1400\" height=\"460\" fill=\"url(#sp1)\"\/>\n    <!-- Grid lines -->\n    <line x1=\"0\" y1=\"115\" x2=\"1400\" y2=\"115\" stroke=\"#ffffff\" stroke-opacity=\"0.04\"\/>\n    <line x1=\"0\" y1=\"230\" x2=\"1400\" y2=\"230\" stroke=\"#ffffff\" stroke-opacity=\"0.04\"\/>\n    <line x1=\"0\" y1=\"345\" x2=\"1400\" y2=\"345\" stroke=\"#ffffff\" stroke-opacity=\"0.04\"\/>\n    <line x1=\"350\" y1=\"0\" x2=\"350\" y2=\"460\" stroke=\"#ffffff\" stroke-opacity=\"0.04\"\/>\n    <line x1=\"700\" y1=\"0\" x2=\"700\" y2=\"460\" stroke=\"#ffffff\" stroke-opacity=\"0.04\"\/>\n    <line x1=\"1050\" y1=\"0\" x2=\"1050\" y2=\"460\" stroke=\"#ffffff\" stroke-opacity=\"0.04\"\/>\n    <!-- Right-side diagram: document flow into agent -->\n    <rect x=\"980\" y=\"90\" width=\"160\" height=\"44\" rx=\"6\" fill=\"#ffffff\" fill-opacity=\"0.07\"\/>\n    <rect x=\"987\" y=\"100\" width=\"60\" height=\"6\" 
rx=\"3\" fill=\"#4CAF92\" fill-opacity=\"0.7\"\/>\n    <rect x=\"987\" y=\"112\" width=\"100\" height=\"4\" rx=\"2\" fill=\"#ffffff\" fill-opacity=\"0.25\"\/>\n    <rect x=\"987\" y=\"120\" width=\"80\" height=\"4\" rx=\"2\" fill=\"#ffffff\" fill-opacity=\"0.18\"\/>\n    <rect x=\"980\" y=\"148\" width=\"160\" height=\"44\" rx=\"6\" fill=\"#ffffff\" fill-opacity=\"0.07\"\/>\n    <rect x=\"987\" y=\"158\" width=\"55\" height=\"6\" rx=\"3\" fill=\"#4CAF92\" fill-opacity=\"0.7\"\/>\n    <rect x=\"987\" y=\"170\" width=\"95\" height=\"4\" rx=\"2\" fill=\"#ffffff\" fill-opacity=\"0.25\"\/>\n    <rect x=\"987\" y=\"178\" width=\"70\" height=\"4\" rx=\"2\" fill=\"#ffffff\" fill-opacity=\"0.18\"\/>\n    <rect x=\"980\" y=\"206\" width=\"160\" height=\"44\" rx=\"6\" fill=\"#ffffff\" fill-opacity=\"0.07\"\/>\n    <rect x=\"987\" y=\"216\" width=\"70\" height=\"6\" rx=\"3\" fill=\"#4CAF92\" fill-opacity=\"0.7\"\/>\n    <rect x=\"987\" y=\"228\" width=\"90\" height=\"4\" rx=\"2\" fill=\"#ffffff\" fill-opacity=\"0.25\"\/>\n    <rect x=\"987\" y=\"236\" width=\"65\" height=\"4\" rx=\"2\" fill=\"#ffffff\" fill-opacity=\"0.18\"\/>\n    <!-- Arrow into agent box -->\n    <line x1=\"1060\" y1=\"260\" x2=\"1060\" y2=\"290\" stroke=\"#4CAF92\" stroke-width=\"2\" stroke-opacity=\"0.6\"\/>\n    <polygon points=\"1055,288 1065,288 1060,298\" fill=\"#4CAF92\" fill-opacity=\"0.7\"\/>\n    <!-- Agent response box -->\n    <rect x=\"960\" y=\"300\" width=\"200\" height=\"100\" rx=\"8\" fill=\"#4CAF92\" fill-opacity=\"0.18\" stroke=\"#4CAF92\" stroke-opacity=\"0.4\" stroke-width=\"1\"\/>\n    <text x=\"1060\" y=\"340\" text-anchor=\"middle\" font-family=\"Georgia,serif\" font-size=\"11\" fill=\"#4CAF92\" fill-opacity=\"0.9\">AI Agent<\/text>\n    <rect x=\"975\" y=\"352\" width=\"110\" height=\"4\" rx=\"2\" fill=\"#ffffff\" fill-opacity=\"0.35\"\/>\n    <rect x=\"975\" y=\"362\" width=\"90\" height=\"4\" rx=\"2\" fill=\"#ffffff\" fill-opacity=\"0.25\"\/>\n    <rect x=\"975\" y=\"372\" 
width=\"130\" height=\"4\" rx=\"2\" fill=\"#ffffff\" fill-opacity=\"0.2\"\/>\n    <!-- SharePoint icon hint -->\n    <text x=\"1155\" y=\"125\" text-anchor=\"middle\" font-family=\"Georgia,serif\" font-size=\"10\" fill=\"#ffffff\" fill-opacity=\"0.3\">SharePoint<\/text>\n    <!-- Left text -->\n    <text x=\"100\" y=\"185\" font-family=\"Georgia,serif\" font-size=\"52\" font-weight=\"700\" fill=\"#ffffff\">MedLearn<\/text>\n    <text x=\"100\" y=\"245\" font-family=\"Georgia,serif\" font-size=\"52\" font-weight=\"700\" fill=\"#ffffff\">SharePoint<\/text>\n    <text x=\"100\" y=\"305\" font-family=\"Georgia,serif\" font-size=\"52\" font-weight=\"700\" fill=\"#4CAF92\">Q&amp;A Agent<\/text>\n    <text x=\"100\" y=\"340\" font-family=\"Georgia,serif\" font-size=\"16\" fill=\"#ffffff\" fill-opacity=\"0.7\">Restricted-folder document querying &#8212; POC<\/text>\n    <!-- Tags -->\n    <rect x=\"100\" y=\"362\" width=\"52\" height=\"22\" rx=\"11\" fill=\"#ffffff\" fill-opacity=\"0.12\"\/>\n    <text x=\"126\" y=\"377\" text-anchor=\"middle\" font-family=\"Georgia,serif\" font-size=\"11\" fill=\"#ffffff\" fill-opacity=\"0.85\">AI<\/text>\n    <rect x=\"162\" y=\"362\" width=\"88\" height=\"22\" rx=\"11\" fill=\"#ffffff\" fill-opacity=\"0.12\"\/>\n    <text x=\"206\" y=\"377\" text-anchor=\"middle\" font-family=\"Georgia,serif\" font-size=\"11\" fill=\"#ffffff\" fill-opacity=\"0.85\">Integration<\/text>\n    <rect x=\"260\" y=\"362\" width=\"46\" height=\"22\" rx=\"11\" fill=\"#4CAF92\" fill-opacity=\"0.3\"\/>\n    <text x=\"283\" y=\"377\" text-anchor=\"middle\" font-family=\"Georgia,serif\" font-size=\"11\" fill=\"#4CAF92\">POC<\/text>\n  <\/svg>\n<\/div>\n\n<p style=\"font-size:17px;line-height:1.75;color:#2d3748;margin:0 0 32px;\">A proof-of-concept to let authorised MedLearn staff query <strong>specific SharePoint folders<\/strong> &#8212; SOPs, how-tos, and technical notes &#8212; using natural language, and receive answers with 
source citations. Requested by Alex Furr (DEO Lead), initially scoped for five named users.<\/p>\n\n<!-- META TABLE -->\n<table style=\"width:100%;border-collapse:collapse;margin:0 0 40px;font-size:15px;\">\n  <tr style=\"border-bottom:1px solid #e8ecf2;\">\n    <td style=\"padding:14px 16px 14px 0;width:180px;font-weight:700;color:#003E74;vertical-align:top;\">Status<\/td>\n    <td style=\"padding:14px 0;color:#2d3748;\">POC &#8212; in development<\/td>\n  <\/tr>\n  <tr style=\"border-bottom:1px solid #e8ecf2;\">\n    <td style=\"padding:14px 16px 14px 0;font-weight:700;color:#003E74;vertical-align:top;\">Requester<\/td>\n    <td style=\"padding:14px 0;color:#2d3748;\">Alex Furr, DEO Lead &#8212; Digital Education Office<\/td>\n  <\/tr>\n  <tr style=\"border-bottom:1px solid #e8ecf2;\">\n    <td style=\"padding:14px 16px 14px 0;font-weight:700;color:#003E74;vertical-align:top;\">Team<\/td>\n    <td style=\"padding:14px 0;color:#2d3748;\">Adrian Cowell<\/td>\n  <\/tr>\n  <tr style=\"border-bottom:1px solid #e8ecf2;\">\n    <td style=\"padding:14px 16px 14px 0;font-weight:700;color:#003E74;vertical-align:top;\">Tech stack<\/td>\n    <td style=\"padding:14px 0;color:#2d3748;\">Python, Microsoft Graph API, Microsoft Entra ID (OAuth2), RAG \/ vector embeddings, WordPress plugin<\/td>\n  <\/tr>\n  <tr style=\"border-bottom:1px solid #e8ecf2;\">\n    <td style=\"padding:14px 16px 14px 0;font-weight:700;color:#003E74;vertical-align:top;\">Users<\/td>\n    <td style=\"padding:14px 0;color:#2d3748;\">~5 named Imperial staff (DEO team)<\/td>\n  <\/tr>\n  <tr>\n    <td style=\"padding:14px 16px 14px 0;font-weight:700;color:#003E74;vertical-align:top;\">GitHub<\/td>\n    <td style=\"padding:14px 0;\"><a href=\"https:\/\/github.com\/adrianImperial\/medlearn-Sharepoint\" target=\"_blank\" style=\"color:#003E74;\">github.com\/adrianImperial\/medlearn-Sharepoint<\/a><\/td>\n  <\/tr>\n<\/table><h2 style=\"font-size:22px;font-weight:700;color:#001E45;margin:0 0 
14px;padding-bottom:10px;border-bottom:2px solid #e8ecf2;\">The Challenge<\/h2>\n<p style=\"font-size:15px;line-height:1.75;color:#2d3748;margin:0 0 20px;\">The MedLearn and DEO teams maintain a growing body of operational knowledge &#8212; SSH commands, deployment procedures, platform SOPs, infrastructure how-tos &#8212; spread across SharePoint folders. Finding a specific piece of information requires knowing which folder to look in, navigating SharePoint&#8217;s interface, and reading through documents manually.<\/p>\n<p style=\"font-size:15px;line-height:1.75;color:#2d3748;margin:0 0 36px;\">The request is simple in concept: <em>point an agent at those folders and ask it questions<\/em>. The implementation involves careful decisions around permissions, data governance, and where processing can take place &#8212; particularly given Imperial&#8217;s M365 environment and information security requirements.<\/p>\n\n<h2 style=\"font-size:22px;font-weight:700;color:#001E45;margin:0 0 14px;padding-bottom:10px;border-bottom:2px solid #e8ecf2;\">What It Will Do<\/h2>\n<p style=\"font-size:15px;line-height:1.75;color:#2d3748;margin:0 0 16px;\">When complete, the agent will:<\/p>\n<ul style=\"font-size:15px;line-height:1.9;color:#2d3748;padding-left:22px;margin:0 0 20px;\">\n  <li>Index documents from <strong>specified SharePoint folders only<\/strong> &#8212; no access beyond the defined scope<\/li>\n  <li>Extract and chunk text from docx, pdf, and pptx files<\/li>\n  <li>Answer natural-language queries (e.g. 
<em>&#8220;What is the SSH command to connect to MedLearn prod?&#8221;<\/em>) with <strong>verbatim supporting snippets and links back to the source file<\/strong><\/li>\n  <li>Respect each user&#8217;s existing SharePoint permissions via delegated Microsoft Entra ID authentication &#8212; no user can retrieve documents they couldn&#8217;t already access<\/li>\n  <li>Log queries and cited sources for audit purposes<\/li>\n<\/ul>\n<p style=\"font-size:15px;line-height:1.75;color:#2d3748;margin:0 0 36px;\">A second surface &#8212; a Microsoft Teams bot or Copilot Studio agent &#8212; is in scope as a follow-on once the core service is validated.<\/p>\n\n<h2 style=\"font-size:22px;font-weight:700;color:#001E45;margin:0 0 14px;padding-bottom:10px;border-bottom:2px solid #e8ecf2;\">Architecture<\/h2>\n\n<div style=\"display:grid;grid-template-columns:1fr 1fr;gap:16px;margin:0 0 32px;\">\n  <div style=\"background:#f0f7f4;border-left:3px solid #4CAF92;border-radius:0 8px 8px 0;padding:18px 20px;\">\n    <p style=\"font-size:12px;font-weight:700;letter-spacing:0.8px;text-transform:uppercase;color:#4CAF92;margin:0 0 8px;\">Auth<\/p>\n    <p style=\"font-size:14px;line-height:1.65;color:#1a3a2a;margin:0;\">Delegated Microsoft Entra ID (OAuth2 authorisation code flow). 
Each user signs in &#8212; the agent uses their Graph token and inherits their SharePoint permissions.<\/p>\n  <\/div>\n  <div style=\"background:#f0f7f4;border-left:3px solid #4CAF92;border-radius:0 8px 8px 0;padding:18px 20px;\">\n    <p style=\"font-size:12px;font-weight:700;letter-spacing:0.8px;text-transform:uppercase;color:#4CAF92;margin:0 0 8px;\">Ingestion<\/p>\n    <p style=\"font-size:14px;line-height:1.65;color:#1a3a2a;margin:0;\">Microsoft Graph API enumerates target folders, downloads file content, extracts text, chunks it, generates embeddings, and stores with metadata (filename, SharePoint URL, last modified).<\/p>\n  <\/div>\n  <div style=\"background:#f0f7f4;border-left:3px solid #4CAF92;border-radius:0 8px 8px 0;padding:18px 20px;\">\n    <p style=\"font-size:12px;font-weight:700;letter-spacing:0.8px;text-transform:uppercase;color:#4CAF92;margin:0 0 8px;\">Retrieval &amp; Answer<\/p>\n    <p style=\"font-size:14px;line-height:1.65;color:#1a3a2a;margin:0;\">Query is embedded and matched against the vector store. Top chunks are passed to an LLM with a strict citation requirement. If confidence is low, the agent says so and returns the most relevant documents instead.<\/p>\n  <\/div>\n  <div style=\"background:#f0f7f4;border-left:3px solid #4CAF92;border-radius:0 8px 8px 0;padding:18px 20px;\">\n    <p style=\"font-size:12px;font-weight:700;letter-spacing:0.8px;text-transform:uppercase;color:#4CAF92;margin:0 0 8px;\">MedLearn UI<\/p>\n    <p style=\"font-size:14px;line-height:1.65;color:#1a3a2a;margin:0;\">A protected WordPress page with a chat interface. Calls the retrieval service and renders answers with cited filenames and SharePoint links. 
SSO-gated to authorised users only.<\/p>\n  <\/div>\n<\/div>\n\n<h2 style=\"font-size:22px;font-weight:700;color:#001E45;margin:0 0 14px;padding-bottom:10px;border-bottom:2px solid #e8ecf2;\">POC Scope &amp; Constraints<\/h2>\n<p style=\"font-size:15px;line-height:1.75;color:#2d3748;margin:0 0 16px;\">To keep the first build fast and governable, the POC is intentionally constrained:<\/p>\n<ul style=\"font-size:15px;line-height:1.9;color:#2d3748;padding-left:22px;margin:0 0 20px;\">\n  <li><strong>Folders:<\/strong> 2&#8211;3 specific SharePoint folder URLs (to be confirmed by requester)<\/li>\n  <li><strong>Users:<\/strong> exactly 5 named Imperial staff &#8212; access is not open<\/li>\n  <li><strong>Auth model:<\/strong> delegated (per-user sign-in, not a service account)<\/li>\n  <li><strong>File types:<\/strong> docx, pdf, pptx, txt\/md &#8212; no OCR on scanned PDFs in v1<\/li>\n  <li><strong>Answer format:<\/strong> verbatim snippets + SharePoint file link + last-modified date<\/li>\n  <li><strong>Audit log:<\/strong> user ID, query, file URLs cited &#8212; stored server-side<\/li>\n<\/ul>\n\n<div style=\"background:#fff8e6;border-left:4px solid #f0a500;border-radius:0 8px 8px 0;padding:18px 22px;margin:0 0 36px;\">\n  <p style=\"font-size:14px;font-weight:700;color:#7a5000;margin:0 0 6px;\">Pending from requester<\/p>\n  <p style=\"font-size:14px;line-height:1.65;color:#5a3c00;margin:0;\">Exact SharePoint folder URLs, confirmation of 5 authorised user accounts, and confirmation that external LLM use (outside M365) is approved by ICT\/InfoSec. 
These are required before the indexing service can be built.<\/p>\n<\/div>\n\n<h2 style=\"font-size:22px;font-weight:700;color:#001E45;margin:0 0 14px;padding-bottom:10px;border-bottom:2px solid #e8ecf2;\">Next Steps<\/h2>\n<ol style=\"font-size:15px;line-height:1.9;color:#2d3748;padding-left:22px;margin:0 0 40px;\">\n  <li>Confirm SharePoint folder URLs and authorised user list with Alex Furr<\/li>\n  <li>Confirm ICT\/InfoSec position on external LLM use vs Azure OpenAI<\/li>\n  <li>Register app in Microsoft Entra ID (Imperial tenant) &#8212; delegated permissions<\/li>\n  <li>Build ingestion service (Graph &#8594; text extraction &#8594; embeddings &#8594; vector store)<\/li>\n  <li>Build retrieval + answer layer with citation enforcement<\/li>\n  <li>Build WordPress chat UI page (SSO-gated)<\/li>\n  <li>Pilot with 5 users, gather feedback, assess Teams surface viability<\/li>\n<\/ol>\n","protected":false},"excerpt":{"rendered":"<p>A proof-of-concept to let authorised MedLearn staff query specific SharePoint folders &#8212; SOPs, how-tos, and 
[&hellip;]<\/p>\n","protected":false},"author":16,"featured_media":0,"parent":7,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-176","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/medlearn.imperial.ac.uk\/innovation\/wp-json\/wp\/v2\/pages\/176","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/medlearn.imperial.ac.uk\/innovation\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/medlearn.imperial.ac.uk\/innovation\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/medlearn.imperial.ac.uk\/innovation\/wp-json\/wp\/v2\/users\/16"}],"replies":[{"embeddable":true,"href":"https:\/\/medlearn.imperial.ac.uk\/innovation\/wp-json\/wp\/v2\/comments?post=176"}],"version-history":[{"count":2,"href":"https:\/\/medlearn.imperial.ac.uk\/innovation\/wp-json\/wp\/v2\/pages\/176\/revisions"}],"predecessor-version":[{"id":269,"href":"https:\/\/medlearn.imperial.ac.uk\/innovation\/wp-json\/wp\/v2\/pages\/176\/revisions\/269"}],"up":[{"embeddable":true,"href":"https:\/\/medlearn.imperial.ac.uk\/innovation\/wp-json\/wp\/v2\/pages\/7"}],"wp:attachment":[{"href":"https:\/\/medlearn.imperial.ac.uk\/innovation\/wp-json\/wp\/v2\/media?parent=176"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}