cgit — URL Routing and Request Dispatch
Overview
cgit supports two URL schemes: virtual-root (path-based) and query-string.
Incoming requests are parsed into a cgit_query structure and dispatched to
one of 21 command handlers via a function pointer table.
Source files: cgit.c (querystring parsing, routing), parsing.c
(cgit_parse_url), cmd.c (command table).
URL Schemes
Virtual Root (Path-Based)
When virtual-root is configured, URLs use clean paths:
/cgit/ → repository list
/cgit/repo.git/ → summary
/cgit/repo.git/log/ → log (default branch)
/cgit/repo.git/log/main/path → log for path on branch main
/cgit/repo.git/tree/v1.0/src/ → tree view at tag v1.0
/cgit/repo.git/commit/?id=abc → commit view
The path after the virtual root is passed in PATH_INFO and parsed by
cgit_parse_url().
Query-String (CGI)
Without virtual root, all parameters are passed in the query string:
/cgit.cgi?url=repo.git/log/main/path&ofs=50
Query Structure
All parsed parameters are stored in ctx.qry:
struct cgit_query {
char *raw; /* raw URL / PATH_INFO */
char *repo; /* repository URL */
char *page; /* page/command name */
char *search; /* search string */
char *grep; /* grep pattern */
char *head; /* branch reference */
char *sha1; /* object SHA-1 */
char *sha2; /* second SHA-1 (for diffs) */
char *path; /* file/dir path within repo */
char *name; /* snapshot name / ref name */
char *url; /* combined URL path */
char *mimetype; /* requested MIME type */
char *etag; /* ETag from client */
int nohead; /* suppress header */
int ofs; /* pagination offset */
int has_symref; /* path contains a symbolic ref */
int has_sha1; /* explicit SHA was given */
int has_dot; /* path contains '..' */
int ignored; /* request should be ignored */
char *sort; /* sort field */
int showmsg; /* show full commit message */
int ssdiff; /* side-by-side diff */
int show_all; /* show all items */
int context; /* diff context lines */
int follow; /* follow renames */
int log_hierarchical_threading;
};
URL Parsing: cgit_parse_url()
In parsing.c, the URL is decomposed into repo, page, and path:
void cgit_parse_url(const char *url)
{
    char *p;
    struct cgit_repo *repo;

    /* Step 1: try progressively longer prefixes as repo URLs */
    /* For each '/' in the URL, check if the prefix matches a repo */
    for (p = strchr(url, '/'); p; p = strchr(p + 1, '/')) {
        *p = '\0';                      /* temporarily terminate the prefix */
        repo = cgit_get_repoinfo(url);
        *p = '/';                       /* restore the separator */
        if (repo) {
            ctx.qry.repo = xstrndup(url, p - url); /* the matched prefix */
            ctx.repo = repo;
            url = p + 1;                /* remaining part */
            break;
        }
    }
    /* if no '/' found, try the whole URL as a repo name (not shown) */

    /* Step 2: parse the remaining path as page/ref/path */
    /* e.g., "log/main/src/file.c" → page="log", path="main/src/file.c" */
    p = strchr(url, '/');
    if (p) {
        ctx.qry.page = xstrndup(url, p - url);
        ctx.qry.path = trim_end(p + 1, '/');
    } else if (*url) {
        ctx.qry.page = xstrdup(url);
    }
}
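The two steps above can be tried in isolation. What follows is a minimal, self-contained sketch and not cgit's code: `known_repos`, `struct parsed_url`, `is_repo()` and `split_url()` are hypothetical names, and a fixed array stands in for cgit_get_repoinfo(). Like the excerpt, it stops at the first prefix that names a repository.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for cgit's repository list. */
static const char *known_repos[] = { "repo.git", "group/tool.git" };

struct parsed_url {
    char repo[128];
    char page[64];
    char path[256];
};

/* Return 1 if the first len bytes of url name a known repository. */
static int is_repo(const char *url, size_t len)
{
    size_t i;

    for (i = 0; i < sizeof(known_repos) / sizeof(known_repos[0]); i++)
        if (strlen(known_repos[i]) == len &&
            !strncmp(known_repos[i], url, len))
            return 1;
    return 0;
}

/* Split "repo/page/path" into its parts; returns 0 if no repo matched. */
static int split_url(const char *url, struct parsed_url *out)
{
    const char *p, *rest, *slash;
    size_t len;

    assert(url && out);
    memset(out, 0, sizeof(*out));

    /* Step 1: test each prefix ending at a '/', then the whole URL. */
    p = strchr(url, '/');
    for (;;) {
        len = p ? (size_t)(p - url) : strlen(url);
        if (is_repo(url, len))
            break;
        if (!p)
            return 0;
        p = strchr(p + 1, '/');
    }
    snprintf(out->repo, sizeof(out->repo), "%.*s", (int)len, url);
    rest = p ? p + 1 : "";

    /* Step 2: the first component is the page, the remainder the path. */
    slash = strchr(rest, '/');
    len = slash ? (size_t)(slash - rest) : strlen(rest);
    snprintf(out->page, sizeof(out->page), "%.*s", (int)len, rest);
    if (slash) {
        snprintf(out->path, sizeof(out->path), "%s", slash + 1);
        /* Drop trailing slashes, as trim_end() does in the excerpt. */
        len = strlen(out->path);
        while (len > 0 && out->path[len - 1] == '/')
            out->path[--len] = '\0';
    }
    return 1;
}
```

Calling split_url("group/tool.git/tree/v1.0/src/", &u) fills u.repo with "group/tool.git", u.page with "tree" and u.path with "v1.0/src". The ref and file path are still joined at this stage.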
Query String Parsing: querystring_cb()
HTTP query parameters and POST form data are decoded by querystring_cb()
in cgit.c. The function maps URL parameter names to ctx.qry fields:
static void querystring_cb(const char *name, const char *value)
{
if (!strcmp(name, "url")) ctx.qry.url = xstrdup(value);
else if (!strcmp(name, "p")) ctx.qry.page = xstrdup(value);
else if (!strcmp(name, "q")) ctx.qry.search = xstrdup(value);
else if (!strcmp(name, "h")) ctx.qry.head = xstrdup(value);
else if (!strcmp(name, "id")) ctx.qry.sha1 = xstrdup(value);
else if (!strcmp(name, "id2")) ctx.qry.sha2 = xstrdup(value);
else if (!strcmp(name, "ofs")) ctx.qry.ofs = atoi(value);
else if (!strcmp(name, "path")) ctx.qry.path = xstrdup(value);
else if (!strcmp(name, "name")) ctx.qry.name = xstrdup(value);
else if (!strcmp(name, "mimetype")) ctx.qry.mimetype = xstrdup(value);
else if (!strcmp(name, "s")) ctx.qry.sort = xstrdup(value);
else if (!strcmp(name, "showmsg")) ctx.qry.showmsg = atoi(value);
else if (!strcmp(name, "ss")) ctx.qry.ssdiff = atoi(value);
else if (!strcmp(name, "all")) ctx.qry.show_all = atoi(value);
else if (!strcmp(name, "context")) ctx.qry.context = atoi(value);
else if (!strcmp(name, "follow")) ctx.qry.follow = atoi(value);
else if (!strcmp(name, "dt")) ctx.qry.dt = atoi(value);
else if (!strcmp(name, "grep")) ctx.qry.grep = xstrdup(value);
else if (!strcmp(name, "etag")) ctx.qry.etag = xstrdup(value);
}
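To show how a callback of this shape is driven, here is a small sketch that splits a query string into name/value pairs and hands each one to a callback. It is an illustration only: `struct mini_query`, `mini_cb()` and `parse_pairs()` are hypothetical names, the input is capped by a fixed buffer, and percent-decoding (which a real parser needs) is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for ctx.qry. */
struct mini_query {
    char page[32];
    char head[64];
    int ofs;
};

static struct mini_query qry;

/* Same shape as querystring_cb(): one call per name/value pair. */
static void mini_cb(const char *name, const char *value)
{
    if (!strcmp(name, "p"))
        snprintf(qry.page, sizeof(qry.page), "%s", value);
    else if (!strcmp(name, "h"))
        snprintf(qry.head, sizeof(qry.head), "%s", value);
    else if (!strcmp(name, "ofs"))
        qry.ofs = atoi(value);
    /* Unknown names fall through and are ignored. */
}

/* Split "a=1&b=2" and invoke cb for each pair. No percent-decoding. */
static void parse_pairs(const char *qs,
                        void (*cb)(const char *, const char *))
{
    char buf[512];
    char *pair, *eq, *next;

    assert(qs && cb);
    snprintf(buf, sizeof(buf), "%s", qs);   /* work on a private copy */
    for (pair = buf; pair; pair = next) {
        next = strchr(pair, '&');
        if (next)
            *next++ = '\0';
        if (!*pair)
            continue;                       /* skip empty segments */
        eq = strchr(pair, '=');
        if (eq)
            *eq++ = '\0';
        cb(pair, eq ? eq : "");
    }
}
```

After parse_pairs("p=log&h=main&ofs=50", mini_cb), the sketch's qry holds page "log", head "main" and offset 50. A parameter without "=" reaches the callback with an empty value.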
URL Parameter Reference
| Parameter | Query Field | Type | Description |
|---|---|---|---|
| url | qry.url | string | Full URL path (repo/page/path) |
| p | qry.page | string | Page/command name |
| q | qry.search | string | Search string |
| h | qry.head | string | Branch/ref name |
| id | qry.sha1 | string | Object SHA-1 |
| id2 | qry.sha2 | string | Second SHA-1 (diffs) |
| ofs | qry.ofs | int | Pagination offset |
| path | qry.path | string | File path in repo |
| name | qry.name | string | Reference/snapshot name |
| mimetype | qry.mimetype | string | MIME type override |
| s | qry.sort | string | Sort field |
| showmsg | qry.showmsg | int | Show full commit message |
| ss | qry.ssdiff | int | Side-by-side diff toggle |
| all | qry.show_all | int | Show all entries |
| context | qry.context | int | Diff context lines |
| follow | qry.follow | int | Follow renames in log |
| dt | qry.dt | int | Diff type |
| grep | qry.grep | string | Grep pattern for log search |
| etag | qry.etag | string | ETag for conditional requests |
Command Dispatch Table
The command table in cmd.c maps page names to handler functions:
#define def_cmd(name, want_hierarchical, want_repo, want_layout, want_vpath, is_clone) \
{#name, cmd_##name, want_hierarchical, want_repo, want_layout, want_vpath, is_clone}
static struct cgit_cmd cmds[] = {
def_cmd(atom, 1, 1, 0, 0, 0),
def_cmd(about, 0, 1, 1, 0, 0),
def_cmd(blame, 1, 1, 1, 1, 0),
def_cmd(blob, 1, 1, 0, 0, 0),
def_cmd(commit, 1, 1, 1, 1, 0),
def_cmd(diff, 1, 1, 1, 1, 0),
def_cmd(head, 1, 1, 0, 0, 1),
def_cmd(info, 1, 1, 0, 0, 1),
def_cmd(log, 1, 1, 1, 1, 0),
def_cmd(ls_cache,0, 0, 0, 0, 0),
def_cmd(objects, 1, 1, 0, 0, 1),
def_cmd(patch, 1, 1, 1, 1, 0),
def_cmd(plain, 1, 1, 0, 1, 0),
def_cmd(rawdiff, 1, 1, 0, 1, 0),
def_cmd(refs, 1, 1, 1, 0, 0),
def_cmd(repolist,0, 0, 1, 0, 0),
def_cmd(snapshot, 1, 1, 0, 0, 0),
def_cmd(stats, 1, 1, 1, 1, 0),
def_cmd(summary, 1, 1, 1, 0, 0),
def_cmd(tag, 1, 1, 1, 0, 0),
def_cmd(tree, 1, 1, 1, 1, 0),
};
Command Flags
| Flag | Meaning |
|---|---|
| want_hierarchical | Parse hierarchical path from URL |
| want_repo | Requires a repository context |
| want_layout | Render within HTML page layout |
| want_vpath | Accept a virtual path (file path in repo) |
| is_clone | HTTP clone protocol endpoint |
Lookup: cgit_get_cmd()
struct cgit_cmd *cgit_get_cmd(const char *name)
{
    for (int i = 0; i < ARRAY_SIZE(cmds); i++)
        if (!strcmp(cmds[i].name, name))
            return &cmds[i];
    return NULL;
}
The function performs a linear search over the table. With only 21 entries, the cost of the lookup is negligible.
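The table-plus-lookup pattern can be reproduced in a few lines. The sketch below is a reduced illustration under stated assumptions: `struct mini_cmd`, `mini_def_cmd`, `mini_cmds` and `mini_get_cmd()` are hypothetical, only two of the five flags are kept, and the handlers return a number where real handlers would render a page. A NULL check is added so that a missing page name cannot reach strcmp().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced command record: two of the five flags. */
struct mini_cmd {
    const char *name;
    int (*fn)(void);
    int want_repo;
    int want_layout;
};

/* Placeholder handlers; real handlers would render a page. */
static int cmd_log(void)      { return 1; }
static int cmd_repolist(void) { return 2; }

/* #name stringifies the argument, cmd_##name pastes the handler symbol. */
#define mini_def_cmd(name, want_repo, want_layout) \
    { #name, cmd_##name, want_repo, want_layout }

static const struct mini_cmd mini_cmds[] = {
    mini_def_cmd(log, 1, 1),
    mini_def_cmd(repolist, 0, 1),
};

/* Linear search by page name; NULL if the name is missing or unknown. */
static const struct mini_cmd *mini_get_cmd(const char *name)
{
    size_t i;

    if (!name)
        return NULL;
    for (i = 0; i < sizeof(mini_cmds) / sizeof(mini_cmds[0]); i++)
        if (!strcmp(mini_cmds[i].name, name))
            return &mini_cmds[i];
    return NULL;
}
```

With this layout, adding a command takes one handler function and one table row. The macro keeps the page name and the handler symbol in sync.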
Request Processing Flow
In process_request() (cgit.c):
1. Parse PATH_INFO via cgit_parse_url()
2. Parse QUERY_STRING via http_parse_querystring(querystring_cb)
3. Parse POST body (for authentication forms)
4. Resolve repository: cgit_get_repoinfo(ctx.qry.repo)
5. Determine command: cgit_get_cmd(ctx.qry.page)
6. If no page specified:
- With repo → default to "summary"
- Without repo → default to "repolist"
7. Check command flags:
- want_repo but no repo → "Repository not found" error
- is_clone and HTTP clone disabled → 404
8. Handle authentication if auth-filter is configured
9. Execute: cmd->fn(&ctx)
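Steps 5 to 7 amount to a small decision function. The sketch below is one possible reading of them, not the body of process_request(): `struct flow_cmd`, `flow_cmds`, `flow_lookup()` and `route_request()` are hypothetical, the table holds only four commands, and the default page is chosen before the lookup because the lookup needs a name. The status codes follow the Error Handling table in this document.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical command record with only the flags used for routing. */
struct flow_cmd {
    const char *name;
    int want_repo;
    int is_clone;
};

static const struct flow_cmd flow_cmds[] = {
    { "summary",  1, 0 },
    { "repolist", 0, 0 },
    { "log",      1, 0 },
    { "info",     1, 1 },
};

static const struct flow_cmd *flow_lookup(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(flow_cmds) / sizeof(flow_cmds[0]); i++)
        if (!strcmp(flow_cmds[i].name, name))
            return &flow_cmds[i];
    return NULL;
}

/*
 * Decide what a request resolves to. Returns an HTTP status code and
 * stores the selected command in *out (NULL when the request is rejected).
 */
static int route_request(const char *page, int have_repo,
                         int clone_enabled, const struct flow_cmd **out)
{
    const struct flow_cmd *cmd;

    *out = NULL;
    if (!page)                              /* step 6: default page */
        page = have_repo ? "summary" : "repolist";
    cmd = flow_lookup(page);                /* step 5: find the command */
    if (!cmd)
        return 404;                         /* page not found */
    if (cmd->want_repo && !have_repo)
        return 404;                         /* repository not found */
    if (cmd->is_clone && !clone_enabled)
        return 404;                         /* HTTP clone disabled */
    *out = cmd;
    return 200;
}
```

A request with no page and a resolved repository selects "summary". The same request without a repository selects "repolist", and "info" is rejected whenever HTTP clone is turned off.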
Hierarchical Path Resolution
When want_hierarchical=1, cgit splits ctx.qry.path into a reference
(branch/tag/SHA) and a file path. It tries progressively longer prefixes
of the path as git references until one resolves:
path = "main/src/lib/file.c"
try: "main" → found branch "main"
qry.head = "main"
qry.path = "src/lib/file.c"
If no prefix resolves, the entire path is treated as a file path within the default branch.
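The prefix search described above can be sketched as follows. This is an illustration under assumptions, not cgit's implementation: `known_refs`, `ref_exists()` and `split_ref_path()` are hypothetical names, and a fixed array replaces the lookup against the repository's actual refs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical set of refs that exist in the repository. */
static const char *known_refs[] = { "main", "v1.0", "feature/login" };

/* Return 1 if the first len bytes of s name a known ref. */
static int ref_exists(const char *s, size_t len)
{
    size_t i;

    for (i = 0; i < sizeof(known_refs) / sizeof(known_refs[0]); i++)
        if (strlen(known_refs[i]) == len &&
            !strncmp(known_refs[i], s, len))
            return 1;
    return 0;
}

/*
 * Split path into a ref and a file path by testing progressively longer
 * prefixes. Returns 1 if a prefix resolved; otherwise returns 0, leaves
 * ref empty and copies the whole path into file.
 */
static int split_ref_path(const char *path, char *ref, size_t refsz,
                          char *file, size_t filesz)
{
    const char *p = path;

    assert(path && ref && file && refsz > 0 && filesz > 0);
    for (;;) {
        const char *slash = strchr(p, '/');
        size_t len = slash ? (size_t)(slash - path) : strlen(path);

        if (ref_exists(path, len)) {
            snprintf(ref, refsz, "%.*s", (int)len, path);
            snprintf(file, filesz, "%s", slash ? slash + 1 : "");
            return 1;
        }
        if (!slash)
            break;
        p = slash + 1;      /* extend the prefix by one component */
    }
    ref[0] = '\0';
    snprintf(file, filesz, "%s", path);
    return 0;
}
```

Because prefixes are tested shortest first, a ref that contains a slash, such as "feature/login", is found only after "feature" alone fails to resolve.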
Clone Protocol Endpoints
Three commands serve the Git HTTP clone protocol:
| Endpoint | Path | Function |
|---|---|---|
| info | repo/info/refs | cgit_clone_info(): advertise refs |
| objects | repo/objects/* | cgit_clone_objects(): serve packfiles |
| head | repo/HEAD | cgit_clone_head(): serve HEAD ref |
These are only active when enable-http-clone=1 (default).
URL Generation
ui-shared.c provides URL construction helpers:
const char *cgit_repourl(const char *reponame);
const char *cgit_fileurl(const char *reponame, const char *pagename,
const char *filename, const char *query);
const char *cgit_pageurl(const char *reponame, const char *pagename,
const char *query);
const char *cgit_currurl(void);
When virtual-root is set, these produce clean paths. Otherwise, they
produce query-string URLs.
Example URL generation:
/* With virtual-root=/cgit/ */
cgit_repourl("myrepo")
→ "/cgit/myrepo/"
cgit_fileurl("myrepo", "tree", "src/main.c", "h=dev")
→ "/cgit/myrepo/tree/src/main.c?h=dev"
cgit_pageurl("myrepo", "log", "ofs=50")
→ "/cgit/myrepo/log/?ofs=50"
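The switch between the two output styles can be sketched with a single formatting helper. Everything here is an assumption made for illustration: `build_fileurl()` and its parameters are hypothetical, the caller supplies the output buffer, and no URL escaping is done. The query-string form mirrors the /cgit.cgi?url=... example from the URL Schemes section.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format a page or file URL into out. With vroot set, the result is a
 * clean path and extra parameters start with '?'. Without it, the path
 * travels in the "url" parameter and extra parameters are joined with '&'.
 */
static void build_fileurl(char *out, size_t outsz, const char *vroot,
                          const char *script, const char *repo,
                          const char *page, const char *file,
                          const char *query)
{
    char delim;
    int n;

    assert(out && outsz > 0 && repo && page);
    if (vroot) {
        n = snprintf(out, outsz, "%s%s/%s/%s",
                     vroot, repo, page, file ? file : "");
        delim = '?';
    } else {
        n = snprintf(out, outsz, "%s?url=%s/%s/%s",
                     script ? script : "", repo, page, file ? file : "");
        delim = '&';
    }
    /* Append the extra parameters only if the base URL was not truncated. */
    if (query && n >= 0 && (size_t)n < outsz)
        snprintf(out + n, outsz - (size_t)n, "%c%s", delim, query);
}
```

With vroot "/cgit/", the arguments ("myrepo", "tree", "src/main.c", "h=dev") yield "/cgit/myrepo/tree/src/main.c?h=dev", matching the example above. With vroot NULL and script "/cgit.cgi", the same arguments yield "/cgit.cgi?url=myrepo/tree/src/main.c&h=dev".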
Content-Type and HTTP Headers
The response content type is set by the command handler before generating output. Common types:
| Page | Content-Type |
|---|---|
| HTML pages | text/html |
| atom | text/xml |
| blob | auto-detected from content |
| plain | MIME type from extension or application/octet-stream |
| snapshot | application/x-gzip, etc. |
| patch | text/plain |
| clone endpoints | text/plain, application/x-git-packed-objects |
Headers are emitted by cgit_print_http_headers() in ui-shared.c before
any page content.
Error Handling
If a requested repository or page is not found, cgit renders an error page within the standard layout. HTTP status codes:
| Condition | Status |
|---|---|
| Normal page | 200 OK |
| Auth redirect | 302 Found |
| Not modified | 304 Not Modified |
| Bad request | 400 Bad Request |
| Auth required | 401 Unauthorized |
| Repo not found | 404 Not Found |
| Page not found | 404 Not Found |
The status code is set in ctx.page.status and emitted by the HTTP header
function.
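As an illustration of that last step, the sketch below turns a status code into a CGI "Status:" header line. The names `status_reason()` and `format_status_header()` are hypothetical, and the reason phrases are the ones listed in the table above. How cgit itself formats the line is not shown in this document.

```c
#include <stdio.h>

/* Map the status codes from the table above to their reason phrases. */
static const char *status_reason(int status)
{
    switch (status) {
    case 200: return "OK";
    case 302: return "Found";
    case 304: return "Not Modified";
    case 400: return "Bad Request";
    case 401: return "Unauthorized";
    case 404: return "Not Found";
    default:  return "Unknown";
    }
}

/*
 * Write a CGI "Status:" header line into out. Returns the number of
 * characters the full line needs, as snprintf() does.
 */
static int format_status_header(char *out, size_t outsz, int status)
{
    return snprintf(out, outsz, "Status: %d %s\n",
                    status, status_reason(status));
}
```

A CGI program writes this line to standard output ahead of the other headers, and the web server converts it into the HTTP status line of the response.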