I asked ChatGPT, the neural network already being called revolutionary for its breadth of knowledge and its ability to handle requests ranging from composing music to writing program code, to imagine that it had built a website and forgotten to remove from the root folder the files that would be most valuable to attackers.

Developers often leave files in a site's root folder, perhaps through inattention, that attackers can use to compromise the site or steal important information. These can be database copies, configuration files, or even source code files. Bug hunters periodically discover such exposures and report them to bug bounty programs (programs that pay for discovered vulnerabilities).


Vulnerability in one of the subdomains of the QIWI payment system
The reward usually depends on how dangerous the discovered files are. For example, in the bug bounty program of the financial services provider QIWI, one researcher received $50 for finding fragments of source files. In another case, a bug hunter was paid $1000 for an exposed .git folder, the directory Git uses to store a repository's history during development.

Report to the QIWI program

The most dangerous files according to ChatGPT
The root folder is the top-level directory of a site; the files in it are addressed by the part of the URL that follows the first slash after the domain. In simple terms, an attack on the site root looks like this: https://example.com/[part_of_the_URL_guessed_by_the_intruder]. Its effectiveness depends directly on the wordlist used by hackers (in our case, security researchers). The more relevant and complete the list, the higher the chances of finding files the developers left behind.
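
The URL pattern above can be sketched in a few lines of Python. This is only an illustration for auditing a site you own or are authorized to test: the function name `candidate_urls` and the sample entries are assumptions made for the example, and the code builds the URLs without sending any requests.

```python
from urllib.parse import quote


def candidate_urls(base_url, wordlist_lines):
    """Return one candidate URL per usable wordlist entry, in order, without duplicates."""
    # Make sure the base ends with exactly one slash before appending entries.
    root = base_url.rstrip("/") + "/"
    seen = set()
    urls = []
    for line in wordlist_lines:
        # Wordlists often mix "name" and "/name"; treat both the same way.
        entry = line.strip().lstrip("/")
        # Skip blank lines and comment lines.
        if not entry or entry.startswith("#"):
            continue
        if entry in seen:
            continue
        seen.add(entry)
        # Percent-encode unusual characters but keep path separators.
        urls.append(root + quote(entry, safe="/"))
    return urls


if __name__ == "__main__":
    # Hypothetical sample; a real run would read the lines from a wordlist file.
    sample = ["config.backup", "test-odbc.php", "", "# comment", "/config.backup"]
    for url in candidate_urls("https://example.com", sample):
        print(url)
```

Each resulting URL would then be requested and the response status inspected; that step is left out here on purpose.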

If you ask ChatGPT directly for the names of potentially dangerous files, it will refuse. The terms of OpenAI, the developer of the neural network, prohibit its use for malicious purposes, so ChatGPT is not supposed to teach hacking. However, if hackers get creative, they can still extract useful information, in particular a list of the most important and common files found in site roots. The list ChatGPT generated for this request contains 5,000 files, including config.backup (which can store important site configuration information) and test-odbc.php (a script for testing the database connection).

The contents of one of the versions of the ChatGPT-fuzz.txt file

Some of the files it contains may already appear in published fuzz.txt lists, which researchers have, as a rule, compiled by hand. This overlap suggests that the neural network is thinking along the right lines. For example, the white-hat hacker Bo0oM maintains a similar list on GitHub. ChatGPT-fuzz.txt shares no more than 354 lines with it; the rest are file names that list does not contain and that bug hunters may not have tried before.
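
An overlap figure like this can be computed with a simple set comparison. The sketch below is an assumption about how such a check might be done, not the method actually used for the article; the function names and sample entries are hypothetical, and entries are normalized by stripping whitespace and a leading slash.

```python
def normalize(lines):
    """Strip whitespace and leading slashes, drop blank entries, return a set."""
    cleaned = (line.strip().lstrip("/") for line in lines)
    return {entry for entry in cleaned if entry}


def compare_wordlists(new_lines, known_lines):
    """Count entries shared by both lists and collect entries only in the new one."""
    new = normalize(new_lines)
    known = normalize(known_lines)
    return {
        "shared": len(new & known),
        "only_in_new": sorted(new - known),
    }


if __name__ == "__main__":
    # Hypothetical samples; real input would be the lines of two wordlist files.
    generated = ["config.backup", "/test.php", " .git/config ", ""]
    published = ["test.php", ".git/config", "robots.txt"]
    print(compare_wordlists(generated, published))
```

An exact-match comparison like this undercounts near-duplicates such as entries that differ only in letter case, so the result is best read as a lower bound on the real overlap.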

What makes this list distinctive is that ChatGPT draws on a huge body of data from the Internet. A person physically cannot process the same amount of information as a neural network.

How ChatGPT collected filenames for a list

Developers need to check sites before hackers arrive
It is often said that recognizing a problem is half the solution. Website developers should think in advance about what they store in the root folder, remove unnecessary files from it now, and better protect the ones that must stay. This will help prevent problems in the future. What could be dangerous about leaked logs, for example? Consider a real case that we found using ChatGPT-fuzz.txt.

Log leak on one of the domains, identified using ChatGPT-fuzz.txt

In this case, the webhook_sms_log.txt file "leaked" the personal data of users, including mobile numbers and home addresses. One can only guess how attackers would use this information if they discovered the vulnerability first.

In addition, developers should not use predictable file names such as test.php or config.txt. Another recommendation, relevant to both developers and information security specialists, is to re-audit sites periodically. If you are responsible for the security of services, you must explain to the developers what the consequences for the company may be if they do not follow these simple rules.
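
A periodic re-audit can start on the server itself, before anyone probes the site from outside. The following is a minimal sketch that walks a local web root and flags names matching a few risky patterns. The pattern list and the function name `find_risky_files` are illustrative assumptions, not a complete or authoritative checklist.

```python
import fnmatch
import os

# Illustrative patterns only (an assumption for this sketch, not a full list).
RISKY_PATTERNS = [
    "*.bak", "*.backup", "*.sql", "*.log", "*_log.txt",
    "test*.php", "config.txt", ".git", ".env",
]


def is_risky(name, patterns=RISKY_PATTERNS):
    """Return True if a file or directory name matches any risky pattern."""
    lowered = name.lower()
    return any(fnmatch.fnmatchcase(lowered, pattern) for pattern in patterns)


def find_risky_files(web_root, patterns=RISKY_PATTERNS):
    """Return sorted relative paths under web_root whose names look risky."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(web_root):
        for name in dirnames + filenames:
            if is_risky(name, patterns):
                rel = os.path.relpath(os.path.join(dirpath, name), web_root)
                hits.append(rel.replace(os.sep, "/"))
        # Report a flagged directory (such as .git) once; do not descend into it.
        dirnames[:] = [d for d in dirnames if not is_risky(d, patterns)]
    return sorted(hits)


if __name__ == "__main__":
    # Scans the current directory; point it at the real web root when auditing.
    for path in find_risky_files("."):
        print(path)
```

Matching on names alone will miss sensitive files with innocuous names, so a check like this complements a manual review rather than replacing it.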

Conclusion
The potential of ChatGPT has not yet been fully exploited. Information security specialists are only now beginning to understand what kind of tool it is and how it can be used in their work. Beyond the names of the most dangerous files, the neural network can help identify common headers, cookies, site configuration settings, and much more.

It is important that as many bug hunters as possible start relying on lists like ChatGPT-fuzz.txt when searching for vulnerabilities. This will increase the security of the services we use every day, and thereby make our world safer.