Webserv - AI Coding Agent Instructions
Project Overview
A C++20 HTTP/1.1 web server implementing epoll-based event-driven architecture. Core components: configuration parser, HTTP request/response handling, CGI execution, static file serving, and routing.
Architecture Fundamentals
Event Loop & Request Flow
Client → epoll_wait → Server::handleEvent → Client → Router → Handler → Response
Critical pattern: The server uses a single epoll instance (Server::epoll_fd_) to multiplex I/O:
- Server::run() contains the main event loop, calling epoll_wait() with a 10ms timeout
- Events trigger specific handlers: EPOLLIN → request reading, EPOLLOUT → response writing
- Each Client manages its own sockets and handler state machines
- Sockets transition through states tracked in ASocket::IoState (READ/WRITE)
Why this matters: All I/O is non-blocking. Never call blocking operations. Use socket->setIOState() and server.update(socket) to change epoll interest masks.
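As a rough illustration of the multiplexing pattern described above, here is a minimal sketch in plain Linux epoll calls. DemoSocket, addToEpoll, updateInterest, and pollOnce are hypothetical names made up for this example; they are not the project's Server, ASocket, or update() implementations.

```cpp
#include <sys/epoll.h>
#include <unistd.h>

#include <cerrno>
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for the project's socket wrapper: one fd plus the
// epoll events it is currently interested in.
struct DemoSocket {
    int fd;
    std::uint32_t interest;  // e.g. EPOLLIN or EPOLLOUT
};

// Register a socket with an epoll instance.
inline void addToEpoll(int epollFd, DemoSocket& sock) {
    epoll_event ev{};
    ev.events = sock.interest;
    ev.data.ptr = &sock;
    if (epoll_ctl(epollFd, EPOLL_CTL_ADD, sock.fd, &ev) == -1)
        throw std::runtime_error("epoll_ctl(ADD) failed");
}

// Change the interest mask of an already registered socket (read <-> write).
inline void updateInterest(int epollFd, DemoSocket& sock, std::uint32_t interest) {
    sock.interest = interest;
    epoll_event ev{};
    ev.events = interest;
    ev.data.ptr = &sock;
    if (epoll_ctl(epollFd, EPOLL_CTL_MOD, sock.fd, &ev) == -1)
        throw std::runtime_error("epoll_ctl(MOD) failed");
}

// One loop iteration: wait at most 10 ms and return the event mask of the
// first ready socket, or 0 if the wait timed out or was interrupted.
inline std::uint32_t pollOnce(int epollFd) {
    epoll_event events[16];
    int n = epoll_wait(epollFd, events, 16, 10);
    if (n == -1) {
        if (errno == EINTR) return 0;
        throw std::runtime_error("epoll_wait failed");
    }
    return n == 0 ? 0 : events[0].events;
}
```

A real loop would dispatch on the returned mask (EPOLLIN to the read path, EPOLLOUT to the write path) and call something like updateInterest once a full request has been read.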
Configuration System
Three-tier hierarchy: GlobalConfig → ServerConfig → LocationConfig
Directive resolution: Uses inheritance with AConfig::get<T>(name) - searches current config, falls back to parent. Example:
auto maxBodySize = locationConfig->get<size_t>("client_max_body_size")
.value_or(serverConfig->get<size_t>("client_max_body_size").value_or(1048576));
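The parent-fallback lookup can be pictured with a small sketch. DemoConfig below is a hypothetical stand-in, not the real AConfig; it assumes values are stored type-erased and that each block holds a pointer to its enclosing block.

```cpp
#include <any>
#include <cstddef>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for AConfig: directive values plus a pointer to the
// enclosing (parent) configuration block.
class DemoConfig {
public:
    explicit DemoConfig(const DemoConfig* parent = nullptr) : parent_(parent) {}

    template <typename T>
    void set(const std::string& name, T value) {
        values_[name] = std::move(value);
    }

    // Look in this block first, then walk up the parent chain.
    template <typename T>
    std::optional<T> get(const std::string& name) const {
        auto it = values_.find(name);
        if (it != values_.end()) {
            if (const T* v = std::any_cast<T>(&it->second)) return *v;
            return std::nullopt;  // directive exists but holds another type
        }
        if (parent_ != nullptr) return parent_->get<T>(name);
        return std::nullopt;
    }

private:
    const DemoConfig* parent_;
    std::map<std::string, std::any> values_;
};
```

With a global → server → location chain built from this class, a value set only at the global level is visible from the location, and a value set on the server shadows it.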
Validation architecture: Two-stage validation in ConfigValidator:
- Structural rules (AStructuralValidationRule): Check block-level requirements (e.g., RequiredDirectivesRule)
- Directive rules (AValidationRule): Validate individual directive values (e.g., PortValidationRule)
Rules are registered in ConfigValidator constructor and executed by ValidationEngine.
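To make the register-then-execute flow concrete, here is a minimal sketch of a directive rule and an engine that runs it. DemoRule, DemoPortRule, and DemoEngine are hypothetical names; the real AValidationRule and ValidationEngine interfaces are not shown in this document and may differ.

```cpp
#include <map>
#include <memory>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical rule interface: returns an error message, or nullopt if valid.
struct DemoRule {
    virtual ~DemoRule() = default;
    virtual std::optional<std::string> validate(const std::string& value) const = 0;
};

// Example directive rule: the value must be an integer in 1..65535.
struct DemoPortRule : DemoRule {
    std::optional<std::string> validate(const std::string& value) const override {
        if (value.empty() || value.size() > 5) return std::string("port must be 1-5 digits");
        int port = 0;
        for (char c : value) {
            if (c < '0' || c > '9') return std::string("port must be numeric");
            port = port * 10 + (c - '0');
        }
        if (port < 1 || port > 65535) return std::string("port out of range");
        return std::nullopt;
    }
};

// Hypothetical engine: rules are registered per directive name, then run
// against every directive that has at least one rule.
class DemoEngine {
public:
    void addRule(const std::string& directive, std::unique_ptr<DemoRule> rule) {
        rules_[directive].push_back(std::move(rule));
    }

    std::vector<std::string> run(const std::map<std::string, std::string>& directives) const {
        std::vector<std::string> errors;
        for (const auto& [name, value] : directives) {
            auto it = rules_.find(name);
            if (it == rules_.end()) continue;
            for (const auto& rule : it->second) {
                if (auto err = rule->validate(value)) errors.push_back(name + ": " + *err);
            }
        }
        return errors;
    }

private:
    std::map<std::string, std::vector<std::unique_ptr<DemoRule>>> rules_;
};
```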
Context-aware directives: The DirectiveFactory uses a context string ("GSL" = Global/Server/Location) to restrict where directives can appear. Check DirectiveFactory::supportedDirectives when adding new directives.
CGI Execution Pipeline
Process model: pipe2() for stdin/stdout/stderr → fork() → execve() in child (the pipes must exist before fork() so both processes share them)
Critical implementation details:
- Use pipe2(O_CLOEXEC | O_NONBLOCK) - flags prevent fd leaks and blocking
- Child process: dup2() pipes to std streams, call Log::clearChannels() before execve()
- Parent: Wrap pipe fds in CgiSocket objects, register with Client::addSocket()
- Environment: CgiEnvironment class builds CGI/1.1 compliant env vars (required: GATEWAY_INTERFACE, SERVER_PROTOCOL, REQUEST_METHOD, etc.)
- Timeout handling: TimerSocket with timerfd_create() registered in epoll
State machine: CgiHandler writes request body → reads response headers → parses headers → reads body → waitpid(WNOHANG) to check status.
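The pipe and process handling can be sketched as follows for the stdout pipe only. runAndCapture and DemoCgiResult are hypothetical names. To stay self-contained, the sketch loops on poll() and finishes with a blocking waitpid(); per the description above, the real handler registers the pipe with epoll and uses waitpid(WNOHANG).

```cpp
#include <fcntl.h>
#include <poll.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cerrno>
#include <cstddef>
#include <stdexcept>
#include <string>

struct DemoCgiResult {
    std::string output;  // everything the child wrote to stdout
    int exitCode;        // exit status, or -1 if the child did not exit normally
};

// Run `path` with the given argv/envp and capture its stdout.
inline DemoCgiResult runAndCapture(const char* path, char* const argv[], char* const envp[]) {
    int out[2];
    // Create the pipe before fork() so both processes share it.
    if (pipe2(out, O_CLOEXEC | O_NONBLOCK) == -1)
        throw std::runtime_error("pipe2 failed");

    pid_t pid = fork();
    if (pid == -1) {
        close(out[0]);
        close(out[1]);
        throw std::runtime_error("fork failed");
    }
    if (pid == 0) {
        // Child: the duplicate created by dup2() does not carry O_CLOEXEC,
        // so stdout survives execve() while the original pipe fds are closed.
        if (dup2(out[1], STDOUT_FILENO) == -1) _exit(126);
        execve(path, argv, envp);
        _exit(127);  // only reached if execve() failed
    }

    close(out[1]);  // parent keeps only the read end
    DemoCgiResult result{std::string(), -1};
    char buf[4096];
    pollfd pfd{out[0], POLLIN, 0};
    for (;;) {
        if (poll(&pfd, 1, -1) == -1) {
            if (errno == EINTR) continue;
            break;
        }
        ssize_t n = read(out[0], buf, sizeof buf);
        if (n > 0) {
            result.output.append(buf, static_cast<std::size_t>(n));
        } else if (n == 0) {
            break;  // EOF: the child closed its end
        } else if (errno != EAGAIN && errno != EINTR) {
            break;  // unrecoverable read error
        }
    }
    close(out[0]);

    int status = 0;
    if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        result.exitCode = WEXITSTATUS(status);
    return result;
}
```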
HTTP Request Parsing
State machine in HttpRequest::State: RequestLine → Headers → Body/Chunked → Complete/ParseError
Chunked transfer encoding: Implemented in parseBufferforChunkedBody():
- Read chunk size (hex) → validate → read chunk data → repeat until size=0
- Parse errors set State::ParseError and call response.setError(400)
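The chunk loop described above can be sketched on a complete in-memory buffer. decodeChunked is a hypothetical helper, not the project's parseBufferforChunkedBody(), which works incrementally on partial input. Here a std::nullopt result stands for the parse-error path, and trailers after the last chunk are ignored.

```cpp
#include <cstddef>
#include <limits>
#include <optional>
#include <string>

// Decode a complete chunked body. Returns nullopt on any malformed chunk.
inline std::optional<std::string> decodeChunked(const std::string& in) {
    std::string body;
    std::size_t pos = 0;
    for (;;) {
        // 1. Read the chunk-size line (hex digits, optional ";extension").
        std::size_t eol = in.find("\r\n", pos);
        if (eol == std::string::npos) return std::nullopt;
        std::size_t size = 0;
        std::size_t i = pos;
        for (; i < eol && in[i] != ';'; ++i) {
            char c = in[i];
            int digit = 0;
            if (c >= '0' && c <= '9') digit = c - '0';
            else if (c >= 'a' && c <= 'f') digit = c - 'a' + 10;
            else if (c >= 'A' && c <= 'F') digit = c - 'A' + 10;
            else return std::nullopt;  // not a hex digit
            if (size > (std::numeric_limits<std::size_t>::max() >> 4))
                return std::nullopt;  // size would overflow
            size = size * 16 + static_cast<std::size_t>(digit);
        }
        if (i == pos) return std::nullopt;  // no digits at all
        pos = eol + 2;

        // 2. A zero-size chunk terminates the body.
        if (size == 0) return body;

        // 3. Read exactly `size` bytes followed by CRLF.
        std::size_t remaining = in.size() - pos;
        if (remaining < size || remaining - size < 2) return std::nullopt;
        body.append(in, pos, size);
        pos += size;
        if (in.compare(pos, 2, "\r\n") != 0) return std::nullopt;
        pos += 2;
    }
}
```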
Critical validation: Host header is mandatory (HTTP/1.1). Checked in setState(State::Complete).
Build & Test System
Build Configuration
- CMake build types: Release (default), Debug, ASAN (AddressSanitizer)
- Makefile wrapper: make release/debug/asan builds specific configurations
- Environment detection: Makefile tracks container vs local builds in build/.build-env, auto-cleans on switch
Test Commands
make test # Build + run unit tests (Google Test)
make test_verbose # Run with detailed output
make coverage # Generate coverage report (requires lcov or gcovr)
./webserv-tester/bin/run_tests.py [--suite SUITE] [--test TEST]
Test structure:
- Unit tests: tests/ directory, organized by component
- Integration tests: webserv-tester/ Python test framework
- Test config: webserv-tester/data/conf/test.conf (port 8080)
Integration Testing with webserv-tester
The webserv-tester/ directory contains a comprehensive Python-based integration test framework that validates HTTP/1.1 compliance, configuration handling, and feature implementation.
Running the tester:
# Run all tests (automatically starts/stops server)
./run_test.sh
# Run specific test suite(s)
./run_test.sh basic
./run_test.sh http
./run_test.sh cgi
Available test suites (in webserv-tester/tests_suites/):
- basic (basic_tests.py): Smoke tests for fundamental functionality (server start, static files, basic requests)
- http (http_tests.py): HTTP/1.1 protocol compliance (headers, status codes, chunked encoding, keep-alive, malformed requests)
- cgi (cgi_tests.py): CGI/1.1 execution (environment variables, stdin/stdout handling, timeouts, error handling)
- method (method_tests.py): HTTP method support per location (GET, POST, DELETE validation against config)
- config (config_tests.py): Configuration directives (inheritance, root, index, autoindex, error pages, redirects, location matching)
- invalid (invalid_config_tests.py): Error handling for malformed configs (missing directives, invalid contexts, syntax errors)
- upload (upload_tests.py): File upload functionality
- uri (uri_tests.py): URI parsing and handling
- redirect (redirect_tests.py): HTTP redirect handling
- cookie (cookie_tests.py): Cookie handling
- security (security_tests.py): Security-related tests
- performance (performance_tests.py): Performance benchmarks
Test framework architecture:
- core/test_case.py: Base class for all tests with assertion helpers
- core/server_manager.py: Manages server process lifecycle (start/stop/restart)
- core/test_runner.py: HTTP request utilities and response validation
- data/conf/test.conf: Test server configuration (port 8080, multiple locations)
- data/www/: Test web content (HTML, CGI scripts, static files)
Writing new tests: Tests inherit from TestCase class and follow this pattern:
class MyTests(TestCase):
    def test_my_feature(self):
        response = self.runner.send_request('GET', '/path')
        self.assert_equals(response.status_code, 200, "Expected 200 OK")
        self.assert_true('Content-Type' in response.headers, "Missing header")
Find test source in webserv-tester/tests_suites/ to understand test scenarios or add new tests for your features.
Code Conventions
Include Order (enforced by .clang-format)
- Own header ("Class.hpp")
- Project headers (<webserv/path/Header.hpp>)
- C++ standard library (<string>)
- C headers (<unistd.h>)
Logging Pattern
Use Log::trace(LOCATION) at function entry for debugging. Available levels: trace, debug, info, warning, error, fatal.
Important: Always log before throwing exceptions or returning errors.
Error Handling
- HTTP errors: Call ErrorHandler::createErrorResponse(statusCode, response, config) - handles custom error pages
- Validation errors: Throw RequestValidator::ValidationException{statusCode} in Router
- Config errors: Throw std::runtime_error with descriptive message during parsing
- CGI errors: Check cgiProcess_->getExitCode(), set response.setStatus(500) if non-zero
Memory Management
- Use std::unique_ptr for ownership (e.g., Client owns ClientSocket)
- Pass raw pointers or references for non-owning access (e.g., Server& in Client)
- Socket ownership: Server owns ServerSocket, Client owns ClientSocket and CgiSocket
Common Patterns & Gotchas
Adding a New Handler
- Inherit from AHandler, implement handle() and handleTimeout()
- Register in Router::handleRequest() based on URI properties
- Use startTimer() from base class if operation may block
- Set response complete: response_.setComplete()
Adding a Configuration Directive
- Add to DirectiveFactory::supportedDirectives with context string
- Create validation rule implementing AValidationRule
- Register in ConfigValidator constructor: engine_->addServerRule(name, std::make_unique<Rule>())
- Access in code: config->get<Type>("directive_name")
Socket State Management
Critical: After modifying socket interest (read→write or vice versa):
socket->setIOState(ASocket::IoState::WRITE);
socket->markDirty(); // Flags for epoll update
// Server polls dirty sockets in pollSockets() and calls update()
URI Resolution
URI class handles path resolution:
- matchConfig(): Longest prefix match for location blocks
- getFullPath(): Resolves root + location path + request path
- isCgi(): Checks if path matches cgi_ext directive
- isRedirect(): Checks for redirect directive
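Longest-prefix matching for location blocks can be sketched as below. matchLocation is a hypothetical free function, not the project's URI::matchConfig(). It uses a plain string-prefix comparison; this document does not say whether the real implementation also enforces a path-segment boundary.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Return the longest location prefix that matches `path`,
// or an empty string if no location matches.
inline std::string matchLocation(const std::vector<std::string>& locations,
                                 const std::string& path) {
    std::string best;
    for (const std::string& loc : locations) {
        // compare() clamps the range, so a location longer than the path
        // simply fails to match.
        if (path.compare(0, loc.size(), loc) != 0) continue;
        if (loc.size() > best.size()) best = loc;
    }
    return best;
}
```

With locations "/", "/images", and "/images/thumbs", a request for "/images/thumbs/a.png" selects "/images/thumbs", while "/index.html" falls back to "/".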
Testing Best Practices
Unit Test Structure
Follow GTest patterns in tests/:
- Use test fixtures inheriting from ::testing::Test
- Name tests descriptively: TEST_F(ClassTest, MethodName_Scenario_ExpectedBehavior)
- One assertion per logical check
- Mock external dependencies (sockets, file I/O)
Integration Test Organization
webserv-tester/tests_suites/ contains:
- basic_tests.py: Smoke tests (server start, static files)
- http_tests.py: Protocol compliance (headers, status codes, chunked encoding)
- cgi_tests.py: CGI execution and environment variables
- method_tests.py: HTTP method support per location
- config_tests.py: Directive inheritance and validation
- invalid_config_tests.py: Error handling for malformed configs
Key Files Reference
- webserv/main.cpp: Entry point, signal handling
- webserv/server/Server.{hpp,cpp}: Event loop, epoll management
- webserv/client/Client.{hpp,cpp}: Per-connection state
- webserv/config/ConfigManager.hpp: Singleton config access
- webserv/config/validation/ConfigValidator.cpp: Validation rule registration
- webserv/router/Router.cpp: Request routing logic
- webserv/handler/CgiProcess.cpp: fork/exec implementation
- webserv/http/HttpRequest.cpp: State machine for parsing
- CMakeLists.txt: Build configuration and test setup
Debugging Tips
AddressSanitizer Build
make asan
./build/webserv config/default.conf
Use for memory leaks, use-after-free, double-free detection.
CGI Debugging
The CGI child's stderr is captured through a CgiSocket and read in CgiHandler::error(). Check the logs for script output.
Epoll Issues
Enable trace logging: modify Log::setLevel(Log::Level::TRACE) in main.cpp. Watch for socket fd lifecycle in logs.
Config Validation
Run ConfigValidator checks before starting server. Errors print to stderr with context (global/server/location and directive name).