directory traversal

What is directory traversal?

Directory traversal is a type of HTTP exploit in which an attacker manipulates the software on a web server into serving data from a directory outside the server’s web root directory. If the attempt is successful, the threat actor can view restricted files or execute commands on the server.

This type of attack is commonly performed using web browsers. Any server that fails to validate input data from web browsers is vulnerable to a directory traversal attack.

Directory traversal is also known as directory climbing, backtracking and file path traversal. It is similar to Structured Query Language injection and cross-site scripting in that all three exploit input that the application fails to validate.

IT security professionals minimize the risk of a directory traversal with the following techniques:

careful web server programming;
installation of software updates and patches;
filtering of input from browsers; and
using vulnerability scanners.

How does directory traversal work?

Hackers use guesswork to find paths to restricted files on a web server. However, a skilled hacker can search the directory tree and easily execute this type of attack on an inadequately protected server.

Only a few resources are needed to perform a directory traversal attack, including the following ones:

access to a web browser;
some knowledge about where to find directories; and
basic knowledge of Hypertext Transfer Protocol (HTTP) requests.
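
To illustrate why so little is needed, here is a minimal Python sketch of a handler that joins a requested name onto the web root without validating it. The web root /var/www/html, the function name naive_resolve and the target /etc/passwd are assumptions chosen for illustration, not details of any particular server.

```python
import posixpath

WEB_ROOT = "/var/www/html"  # hypothetical web root, for illustration only


def naive_resolve(requested: str) -> str:
    """Join the request onto the web root with no validation (vulnerable)."""
    return posixpath.normpath(posixpath.join(WEB_ROOT, requested))


print(naive_resolve("images/logo.png"))      # /var/www/html/images/logo.png
print(naive_resolve("../../../etc/passwd"))  # /etc/passwd
```

Each ../ segment climbs one directory, so three of them are enough to leave this three-level web root.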

What can an attacker do with directory traversal?

Normally, users are unable to access any files outside of the web root folder. Once attackers break out of that folder, they can enter other parts of the computer system. They may also be able to read and write arbitrary files on the server, enabling them to manipulate applications and associated data, read sensitive information like password files or take control of the server.

Attackers can also gain control of access control lists (ACLs), which administrators use to grant various levels of file access to users. With access to ACLs, attackers can impersonate privileged users in the system to inflict damage.

How to check for directory traversal vulnerabilities

Here are two ways to manually check for directory traversal vulnerabilities:

Input vector enumeration. Enumeration tells the tester which parts of a web application could be vulnerable to attempts to bypass input validation. The tester identifies parts of an application that accept user input, including POST and GET calls, file uploads and Hypertext Markup Language forms.
Common patterns. Security pros can look for common patterns in the application’s URL structure to identify directory traversal vulnerabilities. If an application uses a query string parameter to specify the file path, an attacker may be able to manipulate this parameter to access files outside of the intended path. Search engines can also be used to find URLs that are likely to have file names included in them.
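
The pattern check described above could be sketched as a small Python helper that flags query string parameters that appear to carry a file name or path. The parameter names and file extensions in the two constants are hypothetical heuristics; a tester would tune them to the application under review.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical heuristics: names and extensions a tester might treat as file hints.
SUSPECT_NAMES = {"file", "path", "page", "doc", "template", "download"}
SUSPECT_EXTENSIONS = (".pdf", ".doc", ".txt", ".php", ".html", ".log")


def file_like_params(url: str) -> list[str]:
    """Return names of query parameters that appear to carry a file name or path."""
    hits = []
    for name, value in parse_qsl(urlsplit(url).query, keep_blank_values=True):
        looks_like_path = "/" in value or "\\" in value
        has_extension = value.lower().endswith(SUSPECT_EXTENSIONS)
        if name.lower() in SUSPECT_NAMES or looks_like_path or has_extension:
            hits.append(name)
    return hits


print(file_like_params("http://www.example.com/view?file=report.pdf&id=7"))
```

A flagged parameter is only a candidate for further testing, not proof of a vulnerability.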

Automated tools are also used to check for traversal vulnerabilities. These tools perform the following testing techniques:

Static application security testing. Static testing reviews source code for vulnerabilities while the application is not running.
Dynamic application security testing. Dynamic testing tools probe the application for vulnerabilities while it is running. This is done through the front end of the application using black box testing without ever accessing the source code, as if the application were being used. One of these tests is fuzz testing, which submits malformed data to uncover directory traversal vulnerabilities.
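
As a rough idea of what a fuzz test might submit, the Python sketch below builds candidate values for a file parameter. The list of ../ variants reuses the encodings shown in this article's examples; the function name, the depth limit and the target file are assumptions, and real fuzzing tools use far larger payload sets.

```python
from itertools import product

# Plain and percent-encoded forms of "../", as used in the article's examples.
DOT_DOT_SLASH_VARIANTS = ["../", "%2e%2e/", "%2e%2e%2f", "..%2f"]


def traversal_payloads(target: str, max_depth: int = 3) -> list[str]:
    """Build candidate parameter values, one per variant and climbing depth."""
    payloads = []
    for variant, depth in product(DOT_DOT_SLASH_VARIANTS, range(1, max_depth + 1)):
        payloads.append(variant * depth + target)
    return payloads


for payload in traversal_payloads("etc/passwd", max_depth=2):
    print(payload)
```

A tester would send each value to the parameter under test and compare the responses against a known-good baseline.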

Figure: white box testing vs. black box testing. White box testing enables testers to pick out vulnerabilities from the coder’s perspective. Static testing is white box testing.

How to prevent directory traversal attacks

The most effective way to prevent these sorts of path traversal attacks is to avoid passing user input to file system application programming interfaces (APIs). Insufficient filtering of browser and user input can leave web applications and web server files vulnerable to traversal attacks.
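
One way to keep user input away from file system APIs is to let the client send only an identifier and have the server look up the file name itself. The Python sketch below assumes a hypothetical document directory and a hard-coded lookup table; in practice the mapping would more likely live in a database.

```python
from pathlib import Path

DOCUMENT_ROOT = Path("/srv/app/documents")  # hypothetical storage location

# Hypothetical index: the client sends only the key, never a file name.
DOCUMENTS = {
    "1": "annual-report.pdf",
    "2": "price-list.pdf",
}


def path_for(document_id: str) -> Path:
    """Map a client-supplied ID to a server-chosen path; unknown IDs are rejected."""
    try:
        return DOCUMENT_ROOT / DOCUMENTS[document_id]
    except KeyError:
        raise ValueError(f"unknown document id: {document_id!r}") from None
```

Because the path is built only from values the server already holds, a request such as ../../etc/passwd simply fails the lookup.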

If passing user input to the file system APIs can’t be avoided, here are other measures that can help prevent directory traversal:

Sanitize user input. Sanitizing user input ensures that only what is supposed to be submitted ends up being sent to the server. Input should ideally be validated against an allowlist of permitted values, such as a list of permitted strings. If this isn’t possible, then the application should permit only a restricted set of characters, such as alphanumeric characters.
Update web server software. Security administrators should install all updates and patches so that attackers can’t exploit known vulnerabilities.
Segregate documents. Admins should also use cloud storage or host documents on a separate file server so that directories with sensitive material are kept apart from public information directories.
Use content management software. CMS software is a safe way to enable nontechnical users to upload large volumes of content and act like administrators. These users typically do not access the raw URL paths of the documents.
Use indexes. It’s safer to use indexes rather than raw file names in URLs. Indexes add a layer of abstraction between the hacker and the files because an index does not give a hacker direct access to the file, the way the raw file name does.
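
When a file name from the user has to reach the file system, the sanitizing measure above can be combined with a check on the resolved path. The following Python sketch is one possible arrangement, not a complete defense: the base directory and the character allowlist are assumptions, and the function name safe_path is hypothetical.

```python
import re
from pathlib import Path

BASE_DIR = Path("/srv/app/public").resolve()  # hypothetical public directory
# Assumed allowlist: letters, digits, dot, underscore, hyphen; must start alphanumeric.
SAFE_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


def safe_path(filename: str) -> Path:
    """Return a path under BASE_DIR, or raise ValueError if the name is rejected."""
    if not SAFE_NAME.fullmatch(filename):
        raise ValueError("file name contains characters outside the allowlist")
    candidate = (BASE_DIR / filename).resolve()
    # Second check after canonicalization: the result must still sit under BASE_DIR.
    if BASE_DIR not in candidate.parents:
        raise ValueError("resolved path escapes the base directory")
    return candidate
```

The allowlist rejects slashes, percent signs and null bytes before any path is built, and the second check catches cases such as symbolic links that point outside the base directory.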

Examples of directory traversal attacks

One way to perform directory traversal is to send URLs that contain the file name, plus various escape codes, to the server. This lets the attacker work around filtered characters. Using escape codes requires the attacker to guess which character sequences in a URL might be blocked, but that is not an impossible goal.

The following escape codes can bypass filters that block ../ sequences in URLs:

%2e%2e/
%2e%2e%2f

Escape codes are percent-encoded hexadecimal values of the blocked characters in the URL. %2f is the encoding of the forward slash, and %2e is the encoding of the period. If a website isn’t configured to block the escape codes, an unauthorized user could type in this URL http://www.example.com/..%2f..%2f..%2fetc%2fpword in order to access this URL http://www.example.com/../../../etc/pword.
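
The decoding step can be seen with Python’s standard urllib.parse.unquote function. The sketch below decodes the path from the example URL above and then applies a simple check to the decoded form; the helper name contains_traversal is hypothetical.

```python
from urllib.parse import unquote

encoded = "..%2f..%2f..%2fetc%2fpword"  # path portion of the example URL above
print(unquote(encoded))  # ../../../etc/pword


def contains_traversal(raw_path: str) -> bool:
    """Decode the path first, then look for '..' path segments."""
    decoded = unquote(raw_path).replace("\\", "/")
    return ".." in decoded.split("/")


print(contains_traversal("%2e%2e/secret.txt"))  # True
print(contains_traversal("docs/report.pdf"))    # False
```

A filter that searches the raw string for ../ would miss the encoded request, which is why the check here runs after decoding. This sketch decodes only once, so it would not catch input that has been encoded twice.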

Another common way that hackers bypass file validation routines is by inserting null bytes into file names. For example, the sequence %00 decodes to a null byte, which can be injected to confuse a system when it reads a file name. If an attacker sends the parameter ?file=protected.doc%00.pdf, a Java application sees a file name ending in .pdf, whereas an operating system, which treats the null byte as the end of the string, sees a file ending in .doc. If the system is configured to block one but let the other through, this has a chance of bypassing validation routines.
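
The mismatch can be reproduced in a few lines of Python. In this sketch, visible_to_c_api is a hypothetical stand-in for an operating system routine that stops reading at the first null byte, and reject_null_bytes shows one simple countermeasure.

```python
from urllib.parse import unquote


def visible_to_c_api(name: str) -> str:
    """Mimic a C-style string routine, which stops at the first null byte."""
    return name.split("\x00", 1)[0]


def reject_null_bytes(name: str) -> str:
    """Refuse any file name that contains a null byte."""
    if "\x00" in name:
        raise ValueError("file name contains a null byte")
    return name


name = unquote("protected.doc%00.pdf")
print(name.endswith(".pdf"))   # True: an extension check on the full string passes
print(visible_to_c_api(name))  # protected.doc
```

Rejecting the null byte outright avoids having to reason about which layer sees which part of the name.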
